ACL.2019 - Student Research Workshop | Cool Papers

#1 Distributed Knowledge Based Clinical Auto-Coding System [PDF] [Copy] [Kimi¹]

Codification of free-text clinical narratives have long been recognised to be beneficial for secondary uses such as funding, insurance claim processing and research. In recent years, many researchers have studied the use of Natural Language Processing (NLP), related Machine Learning (ML) methods and techniques to resolve the problem of manual coding of clinical narratives. Most of the studies are focused on classification systems relevant to the U.S and there is a scarcity of studies relevant to Australian classification systems such as ICD-10-AM and ACHI. Therefore, we aim to develop a knowledge-based clinical auto-coding system, that utilise appropriate NLP and ML techniques to assign ICD-10-AM and ACHI codes to clinical records, while adhering to both local coding standards (Australian Coding Standard) and international guidelines that get updated and validated continuously.

#2 Robust to Noise Models in Natural Language Processing Tasks [PDF] [Copy] [Kimi¹]

Author: Valentin Malykh

There are a lot of noise texts surrounding a person in modern life. The traditional approach is to use spelling correction, yet the existing solutions are far from perfect. We propose robust to noise word embeddings model, which outperforms existing commonly used models, like fasttext and word2vec in different tasks. In addition, we investigate the noise robustness of current models in different natural language processing tasks. We propose extensions for modern models in three downstream tasks, i.e. text classification, named entity recognition and aspect extraction, which shows improvement in noise robustness over existing solutions.

#3 A Computational Linguistic Study of Personal Recovery in Bipolar Disorder [PDF] [Copy] [Kimi¹]

Author: Glorianna Jagfeld

Mental health research can benefit increasingly fruitfully from computational linguistics methods, given the abundant availability of language data in the internet and advances of computational tools. This interdisciplinary project will collect and analyse social media data of individuals diagnosed with bipolar disorder with regard to their recovery experiences. Personal recovery - living a satisfying and contributing life along symptoms of severe mental health issues - so far has only been investigated qualitatively with structured interviews and quantitatively with standardised questionnaires with mainly English-speaking participants inWestern countries. Complementary to this evidence, computational linguistic methods allow us to analyse first-person accounts shared online in large quantities, representing unstructured settings and a more heterogeneous, multilingual population, to draw a more complete picture of the aspects and mechanisms of personal recovery in bipolar disorder.

#4 Measuring the Value of Linguistics: A Case Study from St. Lawrence Island Yupik [PDF] [Copy] [Kimi¹]

Author: Emily Chen

The adaptation of neural approaches to NLP is a landmark achievement that has called into question the utility of linguistics in the development of computational systems. This research proposal consequently explores this question in the context of a neural morphological analyzer for a polysynthetic language, St. Lawrence Island Yupik. It asks whether incorporating elements of Yupik linguistics into the implementation of the analyzer can improve performance, both in low-resource settings and in high-resource settings, where rich quantities of data are readily available.

#5 Not All Reviews Are Equal: Towards Addressing Reviewer Biases for Opinion Summarization [PDF] [Copy] [Kimi¹]

Author: Wenyi Tay

Consumers read online reviews for insights which help them to make decisions. Given the large volumes of reviews, succinct review summaries are important for many applications. Existing research has focused on mining for opinions from only review texts and largely ignores the reviewers. However, reviewers have biases and may write lenient or harsh reviews; they may also have preferences towards some topics over others. Therefore, not all reviews are equal. Ignoring the biases in reviews can generate misleading summaries. We aim for summarization of reviews to include balanced opinions from reviewers of different biases and preferences. We propose to model reviewer biases from their review texts and rating distributions, and learn a bias-aware opinion representation. We further devise an approach for balanced opinion summarization of reviews using our bias-aware opinion representation.

#6 Towards Turkish Abstract Meaning Representation [PDF] [Copy] [Kimi¹]

Authors: Zahra Azin ; Gülşen Eryiğit

Using rooted, directed and labeled graphs, Abstract Meaning Representation (AMR) abstracts away from syntactic features such as word order and does not annotate every constituent in a sentence. AMR has been specified for English and was not supposed to be an Interlingua. However, several studies strived to overcome divergences in the annotations between English AMRs and those of their target languages by refining the annotation specification. Following this line of research, we have started to build the first Turkish AMR corpus by hand-annotating 100 sentences of the Turkish translation of the novel “The Little Prince” and comparing the results with the English AMRs available for the same corpus. The next step is to prepare the Turkish AMR annotation specification for training future annotators.

#7 Gender Stereotypes Differ between Male and Female Writings [PDF] [Copy] [Kimi¹]

Author: Yusu Qian

Written language often contains gender stereotypes, typically conveyed unintentionally by the author. To study the difference in how female and male authors portray people of different genders, we quantitatively evaluate and analyze the gender stereotypes in their writings on two different datasets and from multiple aspects. We show that writings by females on average have lower gender stereotype scores. We plan to study and interpret the distributions of gender stereotype scores of individual words, and how they differ between male and female writings. We also plan on using more datasets over the past century to study how the stereotypes in female and male writings evolved over time.

#8 Question Answering in the Biomedical Domain [PDF] [Copy] [Kimi¹]

Author: Vincent Nguyen

Question answering techniques have mainly been investigated in open domains. However, there are particular challenges in extending these open-domain techniques to extend into the biomedical domain. Question answering focusing on patients is less studied. We find that there are some challenges in patient question answering such as limited annotated data, lexical gap and quality of answer spans. We aim to address some of these gaps by extending and developing upon the literature to design a question answering system that can decide on the most appropriate answers for patients attempting to self-diagnose while including the ability to abstain from answering when confidence is low.

#9 Knowledge Discovery and Hypothesis Generation from Online Patient Forums: A Research Proposal [PDF] [Copy] [Kimi¹]

Author: Anne Dirkson

The unprompted patient experiences shared on patient forums contain a wealth of unexploited knowledge. Mining this knowledge and cross-linking it with biomedical literature, could expose novel insights, which could subsequently provide hypotheses for further clinical research. As of yet, automated methods for open knowledge discovery on patient forum text are lacking. Thus, in this research proposal, we outline future research into methods for mining, aggregating and cross-linking patient knowledge from online forums. Additionally, we aim to address how one could measure the credibility of this extracted knowledge.

#10 Automated Cross-language Intelligibility Analysis of Parkinson’s Disease Patients Using Speech Recognition Technologies [PDF] [Copy] [Kimi¹]

Authors: Nina Hosseini-Kivanani ; Juan Camilo Vásquez-Correa ; Manfred Stede ; Elmar Nöth

Speech deficits are common symptoms amongParkinson’s Disease (PD) patients. The automatic assessment of speech signals is promising for the evaluation of the neurological state and the speech quality of the patients. Recently, progress has been made in applying machine learning and computational methods to automatically evaluate the speech of PD patients. In the present study, we plan to analyze the speech signals of PD patients and healthy control (HC) subjects in three different languages: German, Spanish, and Czech, with the aim to identify biomarkers to discriminate between PD patients and HC subjects and to evaluate the neurological state of the patients. Therefore, the main contribution of this study is the automatic classification of PD patients and HC subjects in different languages with focusing on phonation, articulation, and prosody. We will focus on an intelligibility analysis based on automatic speech recognition systems trained on these three languages. This is one of the first studies done that considers the evaluation of the speech of PD patients in different languages. The purpose of this research proposal is to build a model that can discriminate PD and HC subjects even when the language used for train and test is different.

#11 Natural Language Generation: Recently Learned Lessons, Directions for Semantic Representation-based Approaches, and the Case of Brazilian Portuguese Language [PDF] [Copy] [Kimi¹]

Authors: Marco Antonio Sobrevilla Cabezudo ; Thiago Pardo

This paper presents a more recent literature review on Natural Language Generation. In particular, we highlight the efforts for Brazilian Portuguese in order to show the available resources and the existent approaches for this language. We also focus on the approaches for generation from semantic representations (emphasizing the Abstract Meaning Representation formalism) as well as their advantages and limitations, including possible future directions.

#12 Long-Distance Dependencies Don’t Have to Be Long: Simplifying through Provably (Approximately) Optimal Permutations [PDF] [Copy] [Kimi¹]

Author: Rishi Bommasani

Neural models at the sentence level often operate on the constituent words/tokens in a way that encodes the inductive bias of processing the input in a similar fashion to how humans do. However, there is no guarantee that the standard ordering of words is computationally efficient or optimal. To help mitigate this, we consider a dependency parse as a proxy for the inter-word dependencies in a sentence and simplify the sentence with respect to combinatorial objectives imposed on the sentence-parse pair. The associated optimization results in permuted sentences that are provably (approximately) optimal with respect to minimizing dependency parse lengths and that are demonstrably simpler. We evaluate our general-purpose permutations within a fine-tuning schema for the downstream task of subjectivity analysis. Our fine-tuned baselines reflect a new state of the art for the SUBJ dataset and the permutations we introduce lead to further improvements with a 2.0% increase in classification accuracy (absolute) and a 45% reduction in classification error (relative) over the previous state of the art.

#13 Predicting the Outcome of Deliberative Democracy: A Research Proposal [PDF] [Copy] [Kimi¹]

Author: Conor McKillop

As liberal states across the world face a decline in political participation by citizens, deliberative democracy is a promising solution for the public’s decreasing confidence and apathy towards the democratic process. Deliberative dialogue is method of public interaction that is fundamental to the concept of deliberative democracy. The ability to identify and predict consensus in the dialogues could bring greater accessibility and transparency to the face-to-face participatory process. The paper sets out a research plan for the first steps at automatically identifying and predicting consensus in a corpus of German language debates on hydraulic fracking. It proposes the use of a unique combination of lexical, sentiment, durational and further ‘derivative’ features of adjacency pairs to train traditional classification models. In addition to this, the use of deep learning techniques to improve the accuracy of the classification and prediction tasks is also discussed. Preliminary results at the classification of utterances are also presented, with an F1 between 0.61 and 0.64 demonstrating that the task of recognising agreement is demanding but possible.

#14 Active Reading Comprehension: A Dataset for Learning the Question-Answer Relationship Strategy [PDF] [Copy] [Kimi¹]

Author: Diana Galván-Sosa

Reading comprehension (RC) through question answering is a useful method for evaluating if a reader understands a text. Standard accuracy metrics are used for evaluation, where high accuracy is taken as indicative of a good understanding. However, literature in quality learning suggests that task performance should also be evaluated on the undergone process to answer. The Question-Answer Relationship (QAR) is one of the strategies for evaluating a reader’s understanding based on their ability to select different sources of information depending on the question type. We propose the creation of a dataset to learn the QAR strategy with weak supervision. We expect to complement current work on reading comprehension by introducing a new setup for evaluation.

#15 Paraphrases as Foreign Languages in Multilingual Neural Machine Translation [PDF] [Copy] [Kimi¹]

Authors: Zhong Zhou ; Matthias Sperber ; Alexander Waibel

Paraphrases, rewordings of the same semantic meaning, are useful for improving generalization and translation. Unlike previous works that only explore paraphrases at the word or phrase level, we use different translations of the whole training data that are consistent in structure as paraphrases at the corpus level. We treat paraphrases as foreign languages, tag source sentences with paraphrase labels, and train on parallel paraphrases in the style of multilingual Neural Machine Translation (NMT). Our multi-paraphrase NMT that trains only on two languages outperforms the multilingual baselines. Adding paraphrases improves the rare word translation and increases entropy and diversity in lexical choice. Adding the source paraphrases boosts performance better than adding the target ones, while adding both lifts performance further. We achieve a BLEU score of 57.2 for French-to-English translation using 24 corpus-level paraphrases of the Bible, which outperforms the multilingual baselines and is +34.7 above the single-source single-target NMT baseline.

#16 Improving Mongolian-Chinese Neural Machine Translation with Morphological Noise [PDF] [Copy] [Kimi¹]

Authors: Yatu Ji ; Hongxu Hou ; Chen Junjie ; Nier Wu

For the translation of agglutinative language such as typical Mongolian, unknown (UNK) words not only come from the quite restricted vocabulary, but also mostly from misunderstanding of the translation model to the morphological changes. In this study, we introduce a new adversarial training model to alleviate the UNK problem in Mongolian-Chinese machine translation. The training process can be described as three adversarial sub models (generator, value screener and discriminator), playing a win-win game. In this game, the added screener plays the role of emphasizing that the discriminator pays attention to the added Mongolian morphological noise in the form of pseudo-data and improving the training efficiency. The experimental results show that the newly emerged Mongolian-Chinese task is state-of-the-art. Under this premise, the training time is greatly shortened.

#17 Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation [PDF] [Copy] [Kimi¹]

Authors: Dušan Variš ; Ondřej Bojar

This work presents our ongoing research of unsupervised pretraining in neural machine translation (NMT). In our method, we initialize the weights of the encoder and decoder with two language models that are trained with monolingual data and then fine-tune the model on parallel data using Elastic Weight Consolidation (EWC) to avoid forgetting of the original language modeling task. We compare the regularization by EWC with the previous work that focuses on regularization by language modeling objectives. The positive result is that using EWC with the decoder achieves BLEU scores similar to the previous work. However, the model converges 2-3 times faster and does not require the original unlabeled training data during the fine-tuning stage. In contrast, the regularization using EWC is less effective if the original and new tasks are not closely related. We show that initializing the bidirectional NMT encoder with a left-to-right language model and forcing the model to remember the original left-to-right language modeling task limits the learning capacity of the encoder for the whole bidirectional context.

#18 Māori Loanwords: A Corpus of New Zealand English Tweets [PDF] [Copy] [Kimi¹]

Authors: David Trye ; Andreea Calude ; Felipe Bravo-Marquez ; Te Taka Keegan

Māori loanwords are widely used in New Zealand English for various social functions by New Zealanders within and outside of the Māori community. Motivated by the lack of linguistic resources for studying how Māori loanwords are used in social media, we present a new corpus of New Zealand English tweets. We collected tweets containing selected Māori words that are likely to be known by New Zealanders who do not speak Māori. Since over 30% of these words turned out to be irrelevant, we manually annotated a sample of our tweets into relevant and irrelevant categories. This data was used to train machine learning models to automatically filter out irrelevant tweets.

#19 Ranking of Potential Questions [PDF] [Copy] [Kimi¹]

Authors: Luise Schricker ; Tatjana Scheffler

Questions are an integral part of discourse. They provide structure and support the exchange of information. One linguistic theory, the Questions Under Discussion model, takes question structures as integral to the functioning of a coherent discourse. This theory has not been tested on the count of its validity for predicting observations in real dialogue data, however. In this submission, a system for ranking explicit and implicit questions by their appropriateness in a dialogue is presented. This system implements constraints and principles put forward in the linguistic literature.

#20 Controlling Grammatical Error Correction Using Word Edit Rate [PDF] [Copy] [Kimi¹]

Authors: Kengo Hotate ; Masahiro Kaneko ; Satoru Katsumata ; Mamoru Komachi

When professional English teachers correct grammatically erroneous sentences written by English learners, they use various methods. The correction method depends on how much corrections a learner requires. In this paper, we propose a method for neural grammar error correction (GEC) that can control the degree of correction. We show that it is possible to actually control the degree of GEC by using new training data annotated with word edit rate. Thereby, diverse corrected sentences is obtained from a single erroneous sentence. Moreover, compared to a GEC model that does not use information on the degree of correction, the proposed method improves correction accuracy.

#21 From Brain Space to Distributional Space: The Perilous Journeys of fMRI Decoding [PDF] [Copy] [Kimi¹]

Authors: Gosse Minnema ; Aurélie Herbelot

Recent work in cognitive neuroscience has introduced models for predicting distributional word meaning representations from brain imaging data. Such models have great potential, but the quality of their predictions has not yet been thoroughly evaluated from a computational linguistics point of view. Due to the limited size of available brain imaging datasets, standard quality metrics (e.g. similarity judgments and analogies) cannot be used. Instead, we investigate the use of several alternative measures for evaluating the predicted distributional space against a corpus-derived distributional space. We show that a state-of-the-art decoder, while performing impressively on metrics that are commonly used in cognitive neuroscience, performs unexpectedly poorly on our metrics. To address this, we propose strategies for improving the model’s performance. Despite returning promising results, our experiments also demonstrate that much work remains to be done before distributional representations can reliably be predicted from brain data.

#22 Towards Incremental Learning of Word Embeddings Using Context Informativeness [PDF] [Copy] [Kimi¹]

Authors: Alexandre Kabbach ; Kristina Gulordava ; Aurélie Herbelot

In this paper, we investigate the task of learning word embeddings from very sparse data in an incremental, cognitively-plausible way. We focus on the notion of ‘informativeness’, that is, the idea that some content is more valuable to the learning process than other. We further highlight the challenges of online learning and argue that previous systems fall short of implementing incrementality. Concretely, we incorporate informativeness in a previously proposed model of nonce learning, using it for context selection and learning rate modulation. We test our system on the task of learning new words from definitions, as well as on the task of learning new words from potentially uninformative contexts. We demonstrate that informativeness is crucial to obtaining state-of-the-art performance in a truly incremental setup.

#23 A Strong and Robust Baseline for Text-Image Matching [PDF] [Copy] [Kimi¹]

Authors: Fangyu Liu ; Rongtian Ye

We review the current schemes of text-image matching models and propose improvements for both training and inference. First, we empirically show limitations of two popular loss (sum and max-margin loss) widely used in training text-image embeddings and propose a trade-off: a kNN-margin loss which 1) utilizes information from hard negatives and 2) is robust to noise as all K-most hardest samples are taken into account, tolerating pseudo negatives and outliers. Second, we advocate the use of Inverted Softmax (IS) and Cross-modal Local Scaling (CSLS) during inference to mitigate the so-called hubness problem in high-dimensional embedding space, enhancing scores of all metrics by a large margin.

#24 Incorporating Textual Information on User Behavior for Personality Prediction [PDF] [Copy] [Kimi¹]

Authors: Kosuke Yamada ; Ryohei Sasano ; Koichi Takeda

Several recent studies have shown that textual information of user posts and user behaviors such as liking and sharing the specific posts are useful for predicting the personality of social media users. However, less attention has been paid to the textual information derived from the user behaviors. In this paper, we investigate the effect of textual information on user behaviors for personality prediction. Our experiments on the personality prediction of Twitter users show that the textual information of user behaviors is more useful than the co-occurrence information of the user behaviors. They also show that taking user behaviors into account is crucial for predicting the personality of users who do not post frequently.

#25 Corpus Creation and Analysis for Named Entity Recognition in Telugu-English Code-Mixed Social Media Data [PDF] [Copy] [Kimi¹]

Authors: Vamshi Krishna Srirangam ; Appidi Abhinav Reddy ; Vinay Singh ; Manish Shrivastava

Named Entity Recognition(NER) is one of the important tasks in Natural Language Processing(NLP) and also is a subtask of Information Extraction. In this paper we present our work on NER in Telugu-English code-mixed social media data. Code-Mixing, a progeny of multilingualism is a way in which multilingual people express themselves on social media by using linguistics units from different languages within a sentence or speech context. Entity Extraction from social media data such as tweets(twitter) is in general difficult due to its informal nature, code-mixed data further complicates the problem due to its informal, unstructured and incomplete information. We present a Telugu-English code-mixed corpus with the corresponding named entity tags. The named entities used to tag data are Person(‘Per’), Organization(‘Org’) and Location(‘Loc’). We experimented with the machine learning models Conditional Random Fields(CRFs), Decision Trees and BiLSTMs on our corpus which resulted in a F1-score of 0.96, 0.94 and 0.95 respectively.